missense variant
Dynamicasome: a molecular dynamics-guided and AI-driven pathogenicity prediction catalogue for all genetic mutations
Islam, Naeyma N, Coban, Mathew A, Fuller, Jessica M, Weber, Caleb, Chitale, Rohit, Jussila, Benjamin, Brock, Trisha J., Tao, Cui, Caulfield, Thomas R
Advances in genomic medicine accelerate the identification of mutations in disease-associated genes, but the pathogenicity of many mutations remains unknown, hindering their use in diagnostics and clinical decision-making. Predictive AI models are generated to combat this issue, but current tools display low accuracy when tested against functionally validated datasets. We show that integrating detailed conformational data extracted from molecular dynamics simulations (MDS) into advanced AI-based models increases their predictive power. We carry out an exhaustive mutational analysis of the disease gene PMM2 and subject structural models of each variant to MDS. AI models trained on this dataset outperform existing tools when predicting the known pathogenicity of mutations. Our best performing model, a neuronal networks model, also predicts the pathogenicity of several PMM2 mutations currently considered of unknown significance. We believe this model helps alleviate the burden of unknown variants in genomic medicine.
Fine-tuning Protein Language Models with Deep Mutational Scanning improves Variant Effect Prediction
Lafita, Aleix, Gonzalez, Ferran, Hossam, Mahmoud, Smyth, Paul, Deasy, Jacob, Allyn-Feuer, Ari, Seaton, Daniel, Young, Stephen
Protein Language Models (PLMs) have emerged as performant and scalable tools for predicting the functional impact and clinical significance of protein-coding variants, but they still lag experimental accuracy. Here, we present a novel fine-tuning approach to improve the performance of PLMs with experimental maps of variant effects from Deep Mutational Scanning (DMS) assays using a Normalised Log-odds Ratio (NLR) head. We find consistent improvements in a held-out protein test set, and on independent DMS and clinical variant annotation benchmarks from ProteinGym and ClinVar. These findings demonstrate that DMS is a promising source of sequence diversity and supervised training data for improving the performance of PLMs for variant effect prediction.
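As a rough illustration of the log-odds idea behind this kind of scoring, the sketch below computes a plain log-odds ratio, log p(mutant) - log p(wild type), from per-position amino-acid logits. It is not the NLR head itself: the abstract does not specify the normalisation or the fine-tuning objective, so neither is reproduced here. The function names, the amino-acid ordering, and the toy logits are all assumptions made for illustration; in practice the logits would come from a PLM.

```python
import math

# Assumed ordering of the 20 standard amino acids (illustrative only).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_odds_ratio(position_logits, wild_type, mutant):
    """Return log p(mutant) - log p(wild_type) at a single position.

    Negative values mean the model finds the mutant residue less likely
    than the wild-type residue at this position.
    """
    probs = softmax(position_logits)
    p_wt = probs[AMINO_ACIDS.index(wild_type)]
    p_mut = probs[AMINO_ACIDS.index(mutant)]
    return math.log(p_mut) - math.log(p_wt)


# Toy logits standing in for a PLM's output at one position (hypothetical).
toy_logits = [0.0] * len(AMINO_ACIDS)
toy_logits[AMINO_ACIDS.index("R")] = 3.0  # wild-type residue strongly favoured
toy_logits[AMINO_ACIDS.index("K")] = 1.0  # conservative substitution

score = log_odds_ratio(toy_logits, "R", "K")
print(f"R->K log-odds ratio: {score:.2f}")
```

Because the softmax denominator cancels, the score equals the difference between the two logits; a fine-tuning head of the kind described would presumably learn to map such scores onto DMS measurements.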
DeepMind's New AI Can Predict Genetic Diseases
About 10 years ago, Žiga Avsec was a PhD physics student who found himself taking a crash course in genomics via a university module on machine learning. He was soon working in a lab that studied rare diseases, on a project aiming to pin down the exact genetic mutation that caused an unusual mitochondrial disease. This was, Avsec says, a "needle in a haystack" problem. There were millions of potential culprits lurking in the genetic code: DNA mutations that could wreak havoc on a person's biology. Of particular interest were so-called missense variants: single-letter changes to genetic code that result in a different amino acid being made within a protein.
VEGN: Variant Effect Prediction with Graph Neural Networks
Cheng, Jun, Lawrence, Carolin, Niepert, Mathias
Genetic mutations can cause disease by disrupting normal gene function. Identifying the disease-causing mutations from millions of genetic variants within an individual patient is a challenging problem. Computational methods that can prioritize disease-causing mutations therefore have enormous applications. It is well known that genes function through a complex regulatory network. However, existing variant effect prediction models only consider a variant in isolation. In contrast, we propose VEGN, which models variant effect prediction using a graph neural network (GNN) that operates on a heterogeneous graph with genes and variants. The graph is created by assigning variants to genes and connecting genes with a gene-gene interaction network. In this context, we explore two approaches: one where the gene-gene graph is given, and another where VEGN learns the gene-gene graph and therefore operates on both given and learnt edges. The graph neural network is trained to aggregate information between genes, and between genes and variants. Variants can exchange information via the genes they connect to. This approach improves the performance of existing state-of-the-art models.
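The information flow described in this abstract (variants to genes, genes to neighbouring genes, genes back to variants) can be sketched with untrained mean aggregation. This is only a minimal illustration under stated assumptions, not VEGN's architecture: there are no learned weights, the gene-gene edges are given rather than learnt, and the function names, feature vectors, and gene labels are hypothetical.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def propagate(variant_feats, variant_gene, gene_edges):
    """One round of variant -> gene -> gene -> variant message passing.

    variant_feats: dict mapping variant id to a feature vector (list of floats)
    variant_gene:  dict mapping variant id to the gene it is assigned to
    gene_edges:    list of undirected (gene, gene) interaction pairs
    """
    # Step 1: each gene summarises the variants assigned to it.
    by_gene = {}
    for variant, gene in variant_gene.items():
        by_gene.setdefault(gene, []).append(variant_feats[variant])
    gene_state = {gene: mean(vecs) for gene, vecs in by_gene.items()}

    # Step 2: each gene averages its own state with its neighbours' states.
    # Edges touching genes without any variants are ignored in this sketch.
    neighbours = {gene: [] for gene in gene_state}
    for a, b in gene_edges:
        if a in gene_state and b in gene_state:
            neighbours[a].append(b)
            neighbours[b].append(a)
    mixed = {
        gene: mean([gene_state[gene]] + [gene_state[n] for n in neighbours[gene]])
        for gene in gene_state
    }

    # Step 3: each variant receives its gene's context, concatenated
    # onto the variant's own features.
    return {
        variant: variant_feats[variant] + mixed[gene]
        for variant, gene in variant_gene.items()
    }


# Hypothetical toy graph: two genes, three variants, one interaction edge.
variant_feats = {"v1": [1.0, 0.0], "v2": [3.0, 0.0], "v3": [0.0, 4.0]}
variant_gene = {"v1": "GENE_A", "v2": "GENE_A", "v3": "GENE_B"}
gene_edges = [("GENE_A", "GENE_B")]

updated = propagate(variant_feats, variant_gene, gene_edges)
for variant, features in updated.items():
    print(variant, features)
```

After one round, v1 carries context that depends on v3 even though the two variants sit in different genes, which is the sense in which variants exchange information via the genes they connect to. A trained GNN would replace the plain means with learned transformations.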